Alpha MAML: Adaptive Model-Agnostic Meta-Learning
Model-agnostic meta-learning (MAML) is a meta-learning technique to train a
model on a multitude of learning tasks in a way that primes the model for
few-shot learning of new tasks. The MAML algorithm performs well on few-shot
learning problems in classification, regression, and fine-tuning of policy
gradients in reinforcement learning, but requires costly hyperparameter
tuning for training stability. We address this shortcoming by
introducing an extension to MAML, called Alpha MAML, to incorporate an online
hyperparameter adaptation scheme that eliminates the need to tune the
meta-learning rate and the inner learning rate. Our results on the Omniglot
database demonstrate a
substantial reduction in the need to tune MAML training hyperparameters and
improved training stability, with less sensitivity to hyperparameter choice.
Comment: 6th ICML Workshop on Automated Machine Learning (2019)
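The abstract does not spell out the adaptation rule, but Alpha MAML builds on the hypergradient-descent rule of Baydin et al. (2018), applying it online to both of MAML's learning rates. Below is a minimal sketch of that rule for plain gradient descent on a toy quadratic; the function and parameter names are illustrative, not the authors' code.

```python
# Hypergradient-style learning-rate adaptation: nudge alpha up when
# successive gradients agree, down when they oppose each other.
import numpy as np

def hypergradient_descent(w, alpha=0.01, beta=1e-4, steps=100):
    """Minimize f(w) = 0.5 * ||w||^2 while adapting `alpha` with meta-rate `beta`."""
    grad_prev = np.zeros_like(w)
    for _ in range(steps):
        grad = w                              # gradient of 0.5 * ||w||^2
        alpha += beta * grad.dot(grad_prev)   # hypergradient update of alpha
        w = w - alpha * grad                  # ordinary descent step
        grad_prev = grad
    return w, alpha

w_final, alpha_final = hypergradient_descent(np.array([5.0, -3.0]))
print(w_final, alpha_final)
```

Alpha MAML applies the same rule twice, once to the task-level (inner) rate and once to the meta-level rate, which is what removes the manual tuning of both.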
Domain invariant representation learning with domain density transformations
Domain generalization refers to the problem where we aim to train a model on data from a set of source domains so that the model can generalize to unseen target domains. Naively training a model on the aggregate set of data (pooled from all source domains) has been shown to perform suboptimally, since the information learned by that model might be domain-specific and generalize imperfectly to target domains. To tackle this problem, a predominant approach is to find and learn some domain-invariant information in order to use it for the prediction task. In this paper, we propose a theoretically grounded method to learn a domain-invariant representation by enforcing the representation network to be invariant under all transformation functions among domains. We also show how to use generative adversarial networks to learn such domain transformations to implement our method in practice. We demonstrate the effectiveness of our method on several widely used datasets for the domain generalization problem, on all of which we achieve results competitive with state-of-the-art models.
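As a rough illustration of the invariance constraint described above, the sketch below penalizes the representation network whenever a domain-transformed input maps to a different representation. `f_domain` stands in for a learned domain translator (in the paper, a GAN-based transformation); all module names and the toy dimensions are assumptions, not the authors' code.

```python
# Domain-invariance penalty: encourage g(x) == g(f(x)) for a domain
# transformation f, alongside the usual classification loss.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))
classifier = nn.Linear(16, 10)
f_domain = nn.Linear(32, 32)  # placeholder for a GAN-learned domain translator

def invariance_loss(x, y, lam=1.0):
    z = encoder(x)
    z_t = encoder(f_domain(x))                 # representation of translated input
    task = nn.functional.cross_entropy(classifier(z), y)
    inv = (z - z_t).pow(2).sum(dim=1).mean()   # invariance penalty
    return task + lam * inv

x = torch.randn(8, 32)
y = torch.randint(0, 10, (8,))
loss = invariance_loss(x, y)
loss.backward()
```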
High-Cadence Thermospheric Density Estimation enabled by Machine Learning on Solar Imagery
Accurate estimation of thermospheric density is critical for precise modeling
of satellite drag forces in low Earth orbit (LEO). Improving this estimation is
crucial to tasks such as state estimation, collision avoidance, and re-entry
calculations. The largest source of uncertainty in determining thermospheric
density is modeling the effects of space weather driven by solar and
geomagnetic activity. Current operational models rely on ground-based proxy
indices, which correlate only imperfectly with the complex solar outputs and
geomagnetic responses. In this work, we directly incorporate NASA's Solar
Dynamics Observatory (SDO) extreme ultraviolet (EUV) spectral images into a
neural thermospheric density model to determine whether the predictive
performance of the model is increased by using space-based EUV imagery data
instead of, or in addition to, the ground-based proxy indices. We demonstrate
that EUV imagery can enable predictions with much higher temporal resolution
and replace ground-based proxies while significantly increasing performance
relative to current operational models. Our method paves the way for
assimilating EUV image data into operational thermospheric density forecasting
models for use in LEO satellite navigation processes.
Comment: Accepted at the Machine Learning and the Physical Sciences workshop,
NeurIPS 202
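A minimal sketch of the kind of model the abstract suggests: a small CNN encodes an SDO EUV image, and a regression head maps the encoding, concatenated with auxiliary inputs such as geomagnetic indices or orbital position, to a density estimate. The layer sizes, input shapes, and names are illustrative assumptions, not the authors' architecture.

```python
# CNN encoder over EUV imagery + MLP head over image features and
# auxiliary scalars, regressing a (log-)density value.
import torch
import torch.nn as nn

class EUVDensityModel(nn.Module):
    def __init__(self, n_aux=4):
        super().__init__()
        self.cnn = nn.Sequential(                               # image encoder
            nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Sequential(                              # regression head
            nn.Linear(16 + n_aux, 32), nn.ReLU(), nn.Linear(32, 1),
        )

    def forward(self, euv_image, aux):
        feat = self.cnn(euv_image)
        return self.head(torch.cat([feat, aux], dim=1))

model = EUVDensityModel()
euv = torch.randn(2, 1, 64, 64)  # batch of single-channel EUV images
aux = torch.randn(2, 4)          # e.g. geomagnetic index, altitude, local time
print(model(euv, aux).shape)     # torch.Size([2, 1])
```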
KL Guided Domain Adaptation
Domain adaptation is an important problem and often needed for real-world applications. In this problem, instead of i.i.d. training and testing datapoints, we assume that the source (training) data and the target (testing) data have different distributions. With that setting, the empirical risk minimization training procedure often does not perform well, since it does not account for the change in the distribution. A common approach in the domain adaptation literature is to learn a representation of the input that has the same (marginal) distribution over the source and the target domain. However, these approaches often require additional networks and/or optimizing an adversarial (minimax) objective, which can be very expensive or unstable in practice. To improve upon these marginal alignment techniques, in this paper, we first derive a generalization bound for the target loss based on the training loss and the reverse Kullback-Leibler (KL) divergence between the source and the target representation distributions. Based on this bound, we derive an algorithm that minimizes the KL term to obtain a better generalization to the target domain. We show that with a probabilistic representation network, the KL term can be estimated efficiently via minibatch samples without any additional network or a minimax objective. This leads to a theoretically sound alignment method which is also very efficient and stable in practice. Experimental results also suggest that our method outperforms other representation-alignment approaches.
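A minimal sketch of the minibatch KL estimate the abstract alludes to, assuming a Gaussian encoder q(z|x) and approximating each domain's marginal representation distribution by its minibatch mixture (1/B) * sum_i q(z|x_i). The sampling direction and all names are assumptions rather than the authors' implementation.

```python
# Sample-based KL estimate between source and target representation
# marginals, each approximated by a minibatch mixture of Gaussians.
import torch
import torch.nn as nn
from torch.distributions import Normal

enc = nn.Linear(32, 2 * 16)  # outputs mean and log-std of q(z|x), dim(z) = 16

def encode(x):
    mu, log_std = enc(x).chunk(2, dim=1)
    return Normal(mu, log_std.exp())

def mixture_log_prob(z, q):
    # log of the minibatch mixture density: log (1/B) sum_i q(z|x_i)
    lp = q.log_prob(z.unsqueeze(1)).sum(-1)  # shape: (num_samples, batch)
    return torch.logsumexp(lp, dim=1) - torch.log(torch.tensor(float(lp.size(1))))

x_src, x_tgt = torch.randn(8, 32), torch.randn(8, 32)
q_src, q_tgt = encode(x_src), encode(x_tgt)
z = q_src.rsample()                          # samples from the source marginal
kl_est = (mixture_log_prob(z, q_src) - mixture_log_prob(z, q_tgt)).mean()
kl_est.backward()                            # no extra network, no minimax game
```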